Private Sampling: A Noiseless Approach for Generating Differentially Private Synthetic Data

Abstract

In a world where artificial intelligence and data science become omnipresent, data sharing is increasingly locking horns with data-privacy concerns. Differential privacy has emerged as a rigorous framework for protecting individual privacy in a statistical database, while releasing useful statistical information about the database. The standard way to implement differential privacy is to inject a sufficient amount of noise into the data. However, in addition to other limitations of differential privacy, this process of adding noise will affect data accuracy and utility. Another approach to enable privacy in data sharing is based on the concept of synthetic data. The goal of synthetic data is to create an as-realistic-as-possible dataset, one that not only maintains the nuances of the original data, but does so without risk of exposing sensitive information. The combination of differential privacy with synthetic data has been suggested as a best-of-both-worlds solution. In this work, we propose the first noise-free method to construct differentially private synthetic data; we do this through a mechanism called private sampling. Using the Boolean cube as a benchmark data model, we derive explicit bounds on the accuracy and privacy of the constructed synthetic data. The key mathematical tools are hypercontractivity, duality, and empirical processes. A core ingredient of our private sampling mechanism is a marginal correction method, which has the remarkable property that importance reweighting can be utilized to exactly match the marginals of the sample to the marginals of the population.
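
For orientation, the standard noise-injection approach that the abstract contrasts against can be sketched as follows. This is not the paper's private sampling mechanism; it is a minimal illustration of the textbook Laplace mechanism applied to the one-way marginals of a dataset on the Boolean cube. The function names are hypothetical, and the sketch assumes that neighboring datasets differ by replacing a single row and that the number of rows n is public.

```python
import random


def one_way_marginals(rows):
    """Fraction of rows holding a 1 in each column of a 0/1 dataset."""
    n = len(rows)
    d = len(rows[0])
    return [sum(row[j] for row in rows) / n for j in range(d)]


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_marginals(rows, epsilon, rng=None):
    """Release all one-way marginals with epsilon-differential privacy.

    Assumption: replacing one row changes each of the d marginals by at
    most 1/n, so the L1 sensitivity of the whole vector is d/n.
    """
    rng = rng or random.Random()
    n = len(rows)
    d = len(rows[0])
    scale = d / (n * epsilon)  # Laplace scale = sensitivity / epsilon
    return [m + laplace_noise(scale, rng) for m in one_way_marginals(rows)]


if __name__ == "__main__":
    data = [[1, 0], [1, 1], [0, 0], [1, 1]]
    print(one_way_marginals(data))
    print(noisy_marginals(data, epsilon=1.0, rng=random.Random(0)))
```

The noise scale grows with the number of columns and shrinks with the number of rows and with epsilon, which is the accuracy cost the abstract refers to when it says that adding noise affects data utility.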

Similar resources

Differentially Private Local Electricity Markets

Privacy-preserving electricity markets have a key role in steering customers towards participation in local electricity markets by guaranteeing to protect their sensitive information. Moreover, these markets make it possible to statically release and share the market outputs for social good. This paper aims to design a market for local energy communities by implementing Differential Privacy (DP)...

Generating Differentially Private Datasets Using GANs

In this paper, we present a technique for generating artificial datasets that retain statistical properties of the real data while providing differential privacy guarantees with respect to this data. We include a Gaussian noise layer in the discriminator of a generative adversarial network to make the output and the gradients differentially private with respect to the training data, and then us...

PCPs and the Hardness of Generating Private Synthetic Data

Assuming the existence of one-way functions, we show that there is no polynomial-time, differentially private algorithm A that takes a database D ∈ ({0, 1}^d)^n and outputs a “synthetic database” D̂ all of whose two-way marginals are approximately equal to those of D. (A two-way marginal is the fraction of database rows x ∈ {0, 1}^d with a given pair of values in a given pair of columns.) This answers...

Differentially Private Trajectory Data Publication

With the increasing prevalence of location-aware devices, trajectory data has been generated and collected in various application domains. Trajectory data carries rich information that is useful for many data analysis tasks. Yet, improper publishing and use of trajectory data could jeopardize individual privacy. However, it has been shown that existing privacy-preserving trajectory data publish...

Journal

Journal title: SIAM Journal on Mathematics of Data Science

Year: 2022

ISSN: 2577-0187

DOI: https://doi.org/10.1137/21m1449944